Character-Based Embedding Models and Reranking Strategies for Understanding Natural Language Meal Descriptions

Authors

Mandy Korpusik

Zachary Collins

James Glass

Abstract

Character-based embedding models provide robustness for handling misspellings and typos in natural language. In this paper, we explore convolutional neural network-based embedding models for handling out-of-vocabulary words in a meal description food ranking task. We demonstrate that character-based models combined with a standard word-based model improve the top-5 recall of USDA database food items from 26.3% to 30.3% on a test set of all USDA foods with typos simulated in 10% of the data. We also propose a new reranking strategy for predicting the top USDA food matches given a meal description, which significantly outperforms our prior method of n-best decoding with a finite state transducer, improving the top-5 recall on the all USDA foods task from 20.7% to 63.8%.
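
The abstract reports results as top-5 recall. As a rough illustration only, and assuming the common definition (the fraction of meal descriptions whose correct USDA food appears among the first five ranked candidates), the metric could be computed as in the sketch below. The function name `top_k_recall` and the toy food names are hypothetical and do not come from the paper.

```python
from typing import Sequence


def top_k_recall(ranked: Sequence[Sequence[str]],
                 gold: Sequence[str],
                 k: int = 5) -> float:
    """Fraction of queries whose gold item is among the first k candidates.

    Assumption: one gold item per query; the paper's exact evaluation
    protocol is not given in the abstract.
    """
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for candidates, target in zip(ranked, gold)
               if target in candidates[:k])
    return hits / len(gold)


# Toy data (hypothetical): two queries, each with a ranked candidate list.
ranked_lists = [
    ["oatmeal", "granola", "muesli"],
    ["milk", "cream", "yogurt", "butter", "cheese", "kefir"],
]
gold_items = ["granola", "kefir"]

recall_at_5 = top_k_recall(ranked_lists, gold_items, k=5)
recall_at_6 = top_k_recall(ranked_lists, gold_items, k=6)
```

In this toy data the second gold item sits at rank 6, so it counts as a miss at k = 5 and as a hit at k = 6.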

Related Articles

Hypotheses Selection Criteria in a Reranking Framework for Spoken Language Understanding

Reranking models have been successfully applied to many tasks of Natural Language Processing. However, there are two aspects of this approach that need a deeper investigation: (i) Assessment of hypotheses generated for reranking at classification phase: baseline models generate a list of hypotheses and these are used for reranking without any assessment; (ii) Detection of cases where reranking ...

Linear Reranking Model for Chinese Pinyin-to-Character Conversion

Pinyin-to-character conversion is an important task in Chinese natural language processing. Previous work mainly focused on n-gram language models and machine learning approaches, sometimes with additional hand-crafted or automatic rule-based post-processing. Word n-gram language models leave two problems unsolved: out-of-vocabulary word recognition and long-distance grammatical c...

Survey on Three Reranking Models for Discriminative Parsing

This survey is inspired by the so-called reranking techniques in natural language processing (NLP). The aim of this survey is to provide an overview of three main reranking tasks particularly for discriminative parsing. We will focus on the motivation for discriminative reranking, on the three models, boosting model, support vector machine (SVM) model and voted perceptron model, on the procedur...

Low-Dimensional Discriminative Reranking

The accuracy of many natural language processing tasks can be improved by a reranking step, which involves selecting a single output from a list of candidate outputs generated by a baseline system. We propose a novel family of reranking algorithms based on learning separate low-dimensional embeddings of the task’s input and output spaces. This embedding is learned in such a way that prediction ...

Character-level Intra Attention Network for Natural Language Inference

Natural language inference (NLI) is a central problem in language understanding. End-to-end artificial neural networks have recently reached state-of-the-art performance in the NLI field. In this paper, we propose the Character-level Intra Attention Network (CIAN) for the NLI task. In our model, we use the character-level convolutional network to replace the standard word embedding layer, and we use the...

Publication date: 2017
